Andrei Mikheev

HCRC, Edinburgh University

Towards a Workbench for Acquisition of Domain Knowledge from Natural Language

Apr 30, 1996

Andrei Mikheev, Steven Finch

Abstract: In this paper we describe the architecture and functionality of the main components of a workbench for the acquisition of domain knowledge from large text corpora. The workbench supports an incremental process of corpus analysis, starting from a rough automatic extraction and organization of lexico-semantic regularities and ending with a computer-supported analysis of the extracted data and a semi-automatic refinement of the obtained hypotheses. To do this, the workbench employs methods from computational linguistics, information retrieval and knowledge engineering. Although the workbench is still under implementation, some of its components are already implemented, and their performance is illustrated with samples from engineering for a medical domain.

* 8 pages, compressed postscript; Proceedings of EACL-95 Dublin, Ireland

Unsupervised Learning of Word-Category Guessing Rules

Apr 30, 1996

Andrei Mikheev

Abstract: Words unknown to the lexicon present a substantial problem for part-of-speech tagging. In this paper we present a technique for fully unsupervised statistical acquisition of rules which guess possible parts-of-speech for unknown words. Three complementary sets of word-guessing rules are induced from the lexicon and a raw corpus: prefix morphological rules, suffix morphological rules and ending-guessing rules. The learning was performed on the Brown Corpus data, and the resulting rule-sets, which showed highly competitive performance, were compared with the state of the art.

* 8 pages, LaTeX (aclap.sty for ACL-96); Proceedings of ACL-96 Santa Cruz, USA; also see cmp-lg/9604025
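
To make the idea of ending-guessing rules concrete, the following is a minimal sketch of how such rules might be induced from a lexicon. It is an illustration only, not the paper's method: the function names, the toy lexicon, the tag labels, and the thresholds `min_count` and `min_share` are all assumptions, and a plain frequency share stands in for whatever statistical scoring the paper uses.

```python
from collections import Counter, defaultdict


def induce_ending_rules(lexicon, max_len=3, min_count=2, min_share=0.8):
    """Map each word ending to the POS class it predicts, if reliable enough.

    Assumption: a rule is kept when its most frequent POS class occurs at
    least `min_count` times and covers at least `min_share` of the words
    with that ending. This is a simplification for illustration.
    """
    counts = defaultdict(Counter)
    for word, tags in lexicon.items():
        pos_class = frozenset(tags)
        # Leave at least one character of the word outside the ending.
        for n in range(1, min(max_len, len(word) - 1) + 1):
            counts[word[-n:]][pos_class] += 1

    rules = {}
    for ending, classes in counts.items():
        pos_class, hits = classes.most_common(1)[0]
        total = sum(classes.values())
        if hits >= min_count and hits / total >= min_share:
            rules[ending] = pos_class
    return rules


def guess(word, rules, max_len=3):
    """Return the POS class of the longest matching ending, or None."""
    for n in range(min(max_len, len(word) - 1), 0, -1):
        pos_class = rules.get(word[-n:])
        if pos_class is not None:
            return pos_class
    return None


# Hypothetical toy lexicon; the tags are illustrative labels only.
TOY_LEXICON = {
    "walking": {"VBG", "NN"},
    "talking": {"VBG", "NN"},
    "running": {"VBG", "NN"},
    "quickly": {"RB"},
    "slowly": {"RB"},
    "badly": {"RB"},
}

rules = induce_ending_rules(TOY_LEXICON)
print(guess("jumping", rules))
print(guess("sadly", rules))
print(guess("table", rules))
```

In this sketch the longest matching ending wins, so an unknown word such as "jumping" is matched by "ing" before the shorter endings "ng" and "g" are tried.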

Learning Part-of-Speech Guessing Rules from Lexicon: Extension to Non-Concatenative Operations

Apr 30, 1996

Andrei Mikheev

Abstract: One of the problems in part-of-speech tagging of real-world texts is that of words unknown to the lexicon. In Mikheev (ACL-96 cmp-lg/9604022), a technique for fully unsupervised statistical acquisition of rules which guess possible parts-of-speech for unknown words was proposed. One of the over-simplifications assumed by this learning technique was the acquisition of morphological rules which obey only simple concatenative regularities of the main word with an affix. In this paper we extend this technique to the non-concatenative cases of suffixation and assess the gain in performance.

* 6 pages, LaTeX (colap.sty for COLING-96); to appear in Proceedings of COLING-96
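
The difference between concatenative and non-concatenative suffixation can be sketched as follows. This is a hypothetical illustration, not the rule format from the paper: the `SuffixRule` structure, its `restore` field, the toy lexicon and the tag labels are all assumptions. A concatenative rule only strips a suffix ("walked" to "walk"), while a non-concatenative rule also restores letters that were changed in the main word ("tried" to "try").

```python
from typing import NamedTuple, Optional


class SuffixRule(NamedTuple):
    """Hypothetical rule format, assumed for illustration."""
    suffix: str              # ending stripped from the unknown word
    restore: str             # letters appended to rebuild the main word
    base_tags: frozenset     # tags the main word must have in the lexicon
    guessed_tags: frozenset  # tags proposed for the unknown word


def apply_rule(word: str, rule: SuffixRule, lexicon: dict) -> Optional[frozenset]:
    """Return the guessed tags if the rule applies to `word`, else None."""
    if not word.endswith(rule.suffix) or len(word) <= len(rule.suffix):
        return None
    # Strip the suffix, then undo the mutation of the main word.
    base = word[: -len(rule.suffix)] + rule.restore
    if rule.base_tags <= lexicon.get(base, frozenset()):
        return rule.guessed_tags
    return None


# Hypothetical toy lexicon; the tags are illustrative labels only.
TOY_LEXICON = {"try": {"VB", "NN"}, "walk": {"VB", "NN"}}

PAST = frozenset({"VBD", "VBN"})
concatenative = SuffixRule("ed", "", frozenset({"VB"}), PAST)       # walk + ed
non_concatenative = SuffixRule("ied", "y", frozenset({"VB"}), PAST)  # try -> tried

print(apply_rule("walked", concatenative, TOY_LEXICON))
print(apply_rule("tried", non_concatenative, TOY_LEXICON))
print(apply_rule("tried", concatenative, TOY_LEXICON))
```

With only the concatenative rule, "tried" is reduced to "tri", which is not in the lexicon, so no guess is made; the rule with a restored "y" recovers "try" and succeeds.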
